[57] ABSTRACT

A method, system, and computer program product are 
provided for visually approximating a scatter plot. Bins of 
scattered data points are formed. Each axis of a scatter plot 
is discretized according to a binning resolution. Bin posi- 
tions along each discretized scatter plot axis are determined 
from the bin numbers. The bins, which represent a cloud of 
scattered data points, are volume rendered as splats. The 
opacity of each splat is a function of the number (count) of 
data points in a corresponding bin. In one example, the 
opacity of a splat is determined by the following equation: 

opacity = 1 - exp(-u * count)

where count is the number of scattered data points in a 
corresponding bin, u is a global scale factor which can be 
varied by a slider, and exp is an exponential function. The 
color of the splat represents a data attribute associated with 
the data points in a corresponding bin. For example, the 
color of the splat is a function of the average value of an 
external data attribute associated with the scattered data 
points. Splats are rendered in a sorted back to front (or front 
to back) order. An opaque dragger object permits a user to 
select different regions inside a splat plot. Arbitrarily large 
numbers of scattered data points can be rendered quickly. 
Speed depends upon binning resolution not the number of 
data points. 

26 Claims, 7 Drawing Sheets 
METHOD, SYSTEM, AND COMPUTER PROGRAM FOR VISUALLY
APPROXIMATING SCATTERED DATA

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to visualizing scattered data
points using a computer display.

2. Related Art

Computer visualization tools are called upon to handle
ever increasing amounts of data. Conventional scatter plots
visually represent multivariate data points as graphical
glyphs plotted along one, two, or three axes. Each data point
has one or more data attributes, also called variables. These
data attributes can be numerical or categorical. Each axis can
represent a different data attribute. Additional data attributes
can be represented by varying the color or size of the glyphs.

Problems are encountered in visualizing scattered data
when the number of data points is large. In general, each data
point in a conventional scatter plot is represented by a
corresponding glyph. As the number of scattered data points
increases, more glyphs crowd a scatter plot display. The time
it takes to render each glyph increases. The time it takes to
build and display a scatter plot can become too long, thereby
precluding interactive, on-the-fly rendering of scattered
data. Occlusion can also occur as data points in the foreground
of a scatter plot hide data points behind them. A
serious problem occurs when many data points occupy the
same location.

To illustrate the above problem, consider a two-dimensional
scatter plot containing millions of data points.
It takes a very long time for a graphics processor to draw
millions of glyphs covering all these data points. If each data
point is represented by a single pixel on the screen, then
there will be many overlapping data points. Only the data
[illegible] pixel.

The same problems occur in three-dimensional scatter
plots where three-dimensional (3-D) glyphs (e.g., cubes,
spheres, etc.) are used to represent data points. These 3-D
glyphs are plotted with respect to three scatter plot axes.
Rendering such a 3-D scatter plot for large numbers of data
points can take a long time, as many glyphs must be
processed. Moreover, if there are many data points to be
covered, glyphs in the foreground occlude those in the back.
Also, data is hidden when the data points are clustered
together. There is no easy way to examine data inside a
cluster.

What is needed is a data visualization tool that visually
approximates a scatter plot when a large number of data
points needs to be drawn.

SUMMARY OF THE INVENTION

The present invention provides a method, system, and
computer program product for visually approximating a
scatter plot. Through a binning process, bins of scattered
data points are formed. Each axis of a scatter plot is
discretized according to a binning resolution. Bin positions
along each discretized scatter plot axis are determined from
bin numbers.

According to the present invention, the bins, which represent
a cloud of scattered data points, are volume rendered
as splats. The opacity of each splat is a function of the
number (count) of data points in a corresponding bin. The
color of the splat is based on one or more data attributes
associated with the data points in a corresponding bin. The
data attribute(s) which is mapped to splat color can be an
external data attribute(s) not represented by any scatter plot
axis.

In one embodiment, the opacity of a splat at its center is
determined by the following equation:

opacity = 1 - exp(-u * count)

where count is the number of scattered data points in a
corresponding bin, u is a global scale factor, and exp denotes
an exponential function. A slider can be provided to vary u
so that the overall opacity of a scatter plot image is altered.
The color of the splat is a function of an aggregate value
(e.g. an average value) of an external data attribute
associated with the scattered data points. Splats are
preferably rendered in a back to front (or front to back)
order, that is, sorted based on distance from the eye point.

The present invention allows arbitrarily large numbers of
scattered data points to be rendered quickly. Speed depends
upon resolution of the binning not the number of data points.
A splat plot is attained which visually approximates a cluster
of data points in a scatter plot. The opacity of a splat visually
represents the density of data points in a corresponding bin.
For uniform bins, the density can be determined based on the
number (count) of data points in a corresponding bin. The
color of the splat visually represents a data attribute
associated with the data points in a corresponding bin. Nearby
composited splats overlap, producing a smooth volume
[illegible].

According to a further feature of the present invention, a
dragger object is displayed that permits a user to select
different regions inside a splat plot. Information about
selected regions can then be displayed.

Further features and advantages of the present invention,
as well as the structure and operation of various embodiments
of the present invention, are described in detail below
with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The file of this patent application contains at least one
drawing executed in color. Copies of this patent with color
drawing(s) will be provided by the Patent and Trademark
Office upon request and payment of the necessary fee.

The accompanying drawings, which are incorporated
herein and form part of the specification, illustrate the
present invention and, together with the description, further
serve to explain the principles of the invention and to enable
a person skilled in the pertinent art to make and use the
invention.

FIG. 1 is a flowchart showing a routine for visually
approximating scattered data according to the present invention.

FIG. 2 is an example color image of a two-dimensional
splat plot that uses splats to visually approximate scattered
data according to the present invention.

FIG. 3 is an example color image of a three-dimensional
splat plot that uses splats to visually approximate scattered
data according to the present invention.

FIG. 4 is a close-up view of the color image of FIG. 3
further showing textured splats.

FIG. 5 shows an example graphics computer system for
executing the routine of FIG. 1.

FIG. 6 is a graph showing an opacity function used in the
present invention for large and small global scale factors.

FIG. 7A is an image of an example Gaussian texture that
can be texture mapped by a graphics engine to form a
textured splat according to the present invention.

FIG. 7B is a graph of the opacity of the Gaussian texture
in FIG. 7A. 

The present invention is described with reference to the 
accompanying drawings. In the drawings, like reference 
numbers indicate identical or functionally similar elements.
Additionally, the left-most digit(s) of a reference number 
identifies the drawing in which the reference number first 
appears. 

DETAILED DESCRIPTION OF THE
EMBODIMENTS
1. Overview and Terminology

The present invention provides a new data visualization
tool that visually approximates a scatter plot. Bins, representing
clouds of scattered data points, are volume rendered
as splats. The opacity of each splat is a function of the
density of data points (e.g. the count or number of data
points) in a corresponding bin. The color of the splat
represents a data attribute associated with the data points in
a corresponding bin.

The following terms are used to describe the present 
invention:

"Data," "data points," "scattered data," "multivariate
data," and equivalents thereof, are used interchangeably to
refer to a data set having corresponding data attributes (also
called variables) that are suitable for a multivariate data
visualization, such as, a scatter plot. One data point can
contain multiple data attributes. Data attributes are represented
as numerical or categorical variables in each axis of
a scatter plot. Numerical variables can include any type of
numerical value or range (e.g. real numbers or integers).
Categorical variables have nominal values like text strings.
For example, a data attribute representing color can include
the following categorical variable values: "red," "blue," and
"orange." Numerical values can also be assigned to each
categorical variable value for sorting and other operations
(e.g. "red" can be set to 1, "blue" can be set to 2, and
"orange" can be set to 3).

A 1-D scatter plot has one axis plotting one variable. A
2-D scatter plot has two axes plotting two variables. A 3-D
scatter plot has three axes plotting three variables. Any type
of data can be used, including but not limited to, business,
engineering, science, and other applications. Data sets can
be received as data records, flat files, relational or non-relational
database files, direct user inputs, or any other data
form.
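
As a minimal illustration of assigning numerical values to categorical variable values, as described in the definition of data attributes above, the following Python sketch numbers each distinct value in order of first appearance. The helper name encode_categories and the first-appearance ordering are assumptions made for this example; any other consistent assignment would serve equally well for sorting.

```python
def encode_categories(values):
    """Assign 1, 2, 3, ... to each distinct categorical value.

    Hypothetical helper: values are numbered in order of first
    appearance, which is only one of many possible assignments.
    """
    codes = {}
    for v in values:
        if v not in codes:
            codes[v] = len(codes) + 1
    return codes


colors = ["red", "blue", "orange", "blue", "red"]
codes = encode_categories(colors)          # "red" -> 1, "blue" -> 2, "orange" -> 3
encoded = [codes[c] for c in colors]       # numeric values usable for sorting
```

Once encoded, the numeric values can be sorted or binned in the same way as any other numerical attribute.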

"Binning" refers to any conventional process for aggre- \ 
gating scattered data points into bins. Bins can be made up
of uniform and/or non-uniform clusters of data points.
"Splat" (also called a footprint) refers to any transparent
shape used to build a transparent volume. For example,
splats, when composited in a back to front order relative to
an eye point (or a front to back order), can be used to
reconstruct transparent volumes.

Splats used in the present invention can include, but are
not limited to, Gaussian splats. A Gaussian splat is one that
is most opaque at its center and approaches zero opacity,
according to a Gaussian function, in every radial direction. A
Gaussian splat is typically approximated with a collection
of Gouraud shaded triangles, or more accurately, as a texture
mapped polygon (e.g. rectangle).
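
The Gaussian fall-off described above can be sketched as a small opacity texture of the kind that could be texture mapped onto a rectangle. This is an illustrative sketch only: the function name, the grid size, the value of sigma, and the normalization of texel coordinates to the range -1 to 1 are assumptions, not values taken from this description.

```python
import math


def gaussian_splat_texture(size=9, sigma=0.35, peak=1.0):
    """Return a size x size grid of opacities for a square splat footprint.

    Texel coordinates are normalized to [-1, 1] on both axes (assumed
    convention). Opacity is peak * exp(-r^2 / (2 * sigma^2)), so it is
    largest at the center and falls toward zero in every radial direction.
    """
    texture = []
    for j in range(size):
        y = 2.0 * j / (size - 1) - 1.0
        row = []
        for i in range(size):
            x = 2.0 * i / (size - 1) - 1.0
            row.append(peak * math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)))
        texture.append(row)
    return texture


texture = gaussian_splat_texture()
center = texture[4][4]   # most opaque texel
corner = texture[0][0]   # nearly transparent texel
```

In a renderer, the peak value would be replaced per splat by the opacity computed for the corresponding bin, so that a single texture can be reused for every splat.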

Splats used in the present invention can also include, but
are not limited to, the examples of splats described in the
following articles (each of which is incorporated by reference
herein): L. Westover, "Footprint Evaluation for Volume
Rendering", Proceedings of SIGGRAPH '90, Vol 24 No 4,
pp 367-376; Lauer and Hanrahan, "Hierarchical Splatting: A
Progressive Refinement Algorithm for Volume Rendering,"
Computer Graphics, vol. 25, No. 4, July 1991, pp. 285-289;
and Crawfis and Max, "Texture Splats for 3D Scalar and
Vector Field Visualization", Proceedings of Visualization
1993, p 261-265. For instance, a splat can be drawn as a
collection of Gouraud shaded triangles (see, e.g., the Lauer
and Hanrahan 1991 article), or as texture mapped rectangles
(see, e.g., the Crawfis and Max 1993 article).

FIG. 7A shows an example of a Gaussian texture 700 that
can be texture mapped by a graphics engine to form a
textured splat. FIG. 7B is a graph of the opacity of Gaussian
texture 700 illustrating the variation in opacity from a peak
at the center to zero according to a Gaussian function.
2. Example Environment

The present invention is described in terms of an example
computer graphics and data mining environment. Given the
description herein, it would be obvious to one skilled in the
art to implement the present invention in any general computer
including, but not limited to, a computer graphics
processor (single chip or multiple chips), high-end to low-end
graphics workstations, virtual machine (e.g. Java-created
application), and network architectures (e.g., client/server,
local, intermediate or wide area networks). In one
preferred example, the present invention can be implemented
as software, firmware, and/or hardware in a data
mining tool, such as, the Mineset product released by Silicon
Graphics, Inc., and executed on a graphics workstation
manufactured by Silicon Graphics, Inc. (e.g., an Indigo2,
Indy, Onyx, or O2 workstation). A further example computer
system is described below with respect to FIG. 5, but is not
intended to limit the present invention.

Description in these terms is provided for convenience 
only. It is not intended that the invention be limited to 
application in this example environment. In fact, after read- 
ing the following description, it will become apparent to a 
person skilled in the relevant art how to implement the 
invention in alternative environments. 
3. Visually Approximating Scattered Data 

FIG. 1 shows a routine 100 for visually approximating 
scattered data according to the present invention. For clarity, 
the steps of routine 100 will be described in general terms 
and with reference to a specific example. The specific 
example uses a sample ore data set of data records related to 
ore samples extracted from a mining site, as shown in the 
table below: 

TABLE 1

Sample Ore Data Set

Longitude (x)   Latitude (y)   Depth (z)    Value (color)
6586.435        21866.457      9849.911     0.01
6585.729        21866.958      9850.411     0.01
6585.023        21867.459      9850.911     0.01
6584.337        21867.961      9851.411     0.02
6568.526        22281.813      10028.994    2.35
6628.461        22281.813      10090.753    0.14
6650.017        22281.834      10094.368    0.02
6631.844        22281.848      10152.818    0.03
6599.928        22281.867      10067.001    0.05
(8191 rows)









Each data record has four attributes (longitude, latitude, 
depth, and value) characterizing each ore sample. The first 
three attributes represent the x,y,z location (longitude, 
latitude, depth) of an ore sample within the mining site. The 
fourth attribute (value) gives an indication of the quality of 
the ore in the sample taken at that location. These four
attributes are illustrative. Each data record can have many
additional attributes.

TABLE 2

Binned Sample Ore Data Set

Longitude-Bin   Latitude-Bin   Depth-Bin   Value    Count
0               15             17          0.02     6
0               21             12          0.02     3
1               20             0           0.02     1
1               21             12          0.0225   4
2               14             17          0.0266   3
2               15             17          0.01     1
2               20             0           0.027    7
42              49             37          0.03     3
43              48             36          0.04     1
49              49             36          0.01     1
(1743 rows)

A. Binning

In step 110, each scatter axis variable is discretized
according to a binning resolution to form bins. In general,
any conventional binning technique (uniform or non-uniform)
can be used to bin numeric (i.e. real-valued)
attributes and categorical attributes. If a categorical attribute
is mapped to an axis then the binning is defined to be the
distinct values of that attribute, or some grouping of these
values based on metadata.

Preferably, uniform bins are created for each variable (or
data attribute) that is mapped to a scatter axis. Non-uniform
bins can also be formed. In one simple example, each axis
of a scatter plot can be discretized into k bins, where k is a
positive integer. In a 2-D case, k1 * k2 bins (or "buckets") are
available for aggregating data points on two respective
discretized axes. In a 3-D case, k1 * k2 * k3 bins are available
for aggregating data points on three respective discretized
axes. Binning is performed as part of pre-processing to
reduce processing during rendering.

Bin positions are then determined from the bins (step
120). Bin positions define the order of bins along each
discretized axis and can be determined from bin numbers
associated with the bins. For a numeric attribute, the bin
numbers are determined from the discretized real-values,
that is, sorting the bins based on the discretized real-values
and determining corresponding bin positions. For a categorical
attribute, the bin numbers are determined from the
distinct values of that attribute. The order of bins (and
corresponding bin positions) along a discretized axis can be
determined by sorting the distinct categorical values in any
number of different ways. For example, sorting methods can
include, but are not limited to, sorting based on alphabetical
or numeric order, sorting based on count, or sorting based on
an aggregate value (e.g. average) of the attribute mapped to
color.

A count of the number of scattered data points aggregated
into each bin is determined (step 130). An aggregate value
that represents a data attribute of the scattered data points in
a bin is determined for each bin as well (step 140). The
aggregate value in one preferred example is an average
value of a data attribute of scatter data points in a bin. The
aggregate value can also be a minimum, maximum, median,
count, or any other value representing a data attribute of
scatter data points in a bin.

The aggregate value can represent an external variable not
mapped to an axis or a data variable that is mapped to an
axis. There could be multiple value columns in Tables 1
and/or 2, each value column representing a different data
attribute. In the splat plot visualization as described below,
it is a simple matter to select among the value columns for
purposes of mapping the color without doing any additional
computation.

A data structure can be created to store bin position, count,
and value data for each bin as determined in steps 120 to
140, respectively. For example, a new table having records
corresponding to bins and data attributes representative of
the bins (e.g. bin position, count, and value of an external
attribute) can be created. An example new binned table
drawn from the sample ore data set of Table 1 for three
dimensions (longitude, latitude, and depth) is shown in
Table 2.

The binning resolution here is arbitrarily chosen to be k=50,
making rendering about ten times faster than a scatter plot
using all of the data points. A two-dimension example would
only need two bin position data attributes (e.g., any two of
longitude, latitude, or depth). A one-dimension example
would only need one bin position data attribute (e.g.,
longitude, latitude, or depth).

B. Rendering Splats

Next, in step 150, splats representative of the bins are
rendered in a graphics engine. A splat is drawn at each bin
location to form an image that visually approximates an
original scatter plot of the data. Splats are rendered in a
back-to-front order (or front-to-back order) during compositing
such that splats located furthest from a display screen
are rendered before splats located closer to a display screen.

For each bin, the count of scattered data points aggregated
in the bin is mapped to a splat opacity (step 160). In one
example, a graphics engine [illegible] across a polygon to
represent a splat.

In one preferred embodiment, the splat opacity is a
function of the count of aggregated data points in a corresponding
bin as determined by the following equation:

opacity = 1 - exp(-u * count),

where opacity represents the opacity value of a splat at its
center, count represents the count of aggregated data points
in a corresponding bin, u represents a global scale factor, and
exp denotes an exponential function (such as, an exponential
function having a natural logarithm base e). The above
exponential opacity function is effective in modeling light
propagation through clouds of light emitting spheres.

A slider can be provided to vary the value of the global
scale factor u. This allows globally scaling of the opacity for
each splat to make an entire display image of rendered splats
more or less transparent. This scaling by the global scale
factor, while impacting the entire image, is not linear. A
splat's opacity is scaled differently depending upon its count,
that is, the number of data points the splat represents.

FIG. 6 is a graph showing the above opacity function
using large and small global scale factors. In particular, the
opacity value for a splat is an exponential function of the
count of the scattered data points that approaches an asymptotic
limit 1 for large counts. The global scale factor is set to
a value (i.e. large or small) for a particular image. As shown
in FIG. 6, when a large global scale factor is used, each
splat's opacity approaches the asymptotic limit 1 more
quickly (for lower counts) than with a small global scale
factor.

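
Steps 110 through 160 above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used to produce Table 2: the function names are hypothetical, uniform bins are computed from the minimum and maximum of the records supplied, the eye point is expressed in bin-index coordinates, and only the nine rows printed in Table 1 are used, so the resulting bins will not match the rows of Table 2 (which were derived from all 8191 records).

```python
import math
from collections import defaultdict


def bin_number(value, lo, hi, k):
    """Uniformly discretize value in [lo, hi] into a bin number 0..k-1."""
    if hi <= lo:
        return 0
    i = int((value - lo) / (hi - lo) * k)
    return min(max(i, 0), k - 1)  # value == hi falls in the last bin


def bin_points(records, k):
    """Steps 110-140: aggregate (x, y, z, value) records into bins.

    Returns {(bx, by, bz): (count, average value)}.
    """
    lows = [min(r[a] for r in records) for a in range(3)]
    highs = [max(r[a] for r in records) for a in range(3)]
    counts = defaultdict(int)
    sums = defaultdict(float)
    for r in records:
        key = tuple(bin_number(r[a], lows[a], highs[a], k) for a in range(3))
        counts[key] += 1
        sums[key] += r[3]
    return {key: (counts[key], sums[key] / counts[key]) for key in counts}


def splat_opacity(count, u):
    """Step 160: opacity = 1 - exp(-u * count)."""
    return 1.0 - math.exp(-u * count)


def back_to_front(bin_keys, eye):
    """Order bins so that those furthest from the eye point come first."""
    return sorted(bin_keys, key=lambda b: math.dist(b, eye), reverse=True)


# The nine rows printed in Table 1: (longitude, latitude, depth, value).
records = [
    (6586.435, 21866.457, 9849.911, 0.01),
    (6585.729, 21866.958, 9850.411, 0.01),
    (6585.023, 21867.459, 9850.911, 0.01),
    (6584.337, 21867.961, 9851.411, 0.02),
    (6568.526, 22281.813, 10028.994, 2.35),
    (6628.461, 22281.813, 10090.753, 0.14),
    (6650.017, 22281.834, 10094.368, 0.02),
    (6631.844, 22281.848, 10152.818, 0.03),
    (6599.928, 22281.867, 10067.001, 0.05),
]

bins = bin_points(records, k=50)
eye = (25.0, 25.0, -100.0)  # assumed eye point in bin-index coordinates
draw_order = back_to_front(bins.keys(), eye)
for key in draw_order:
    count, average = bins[key]
    # Compare a small and a large global scale factor, as in FIG. 6.
    print(key, count, round(average, 4),
          round(splat_opacity(count, 0.1), 3),
          round(splat_opacity(count, 2.0), 3))
```

Because the loop over bins is independent of the number of original records, the cost of drawing depends on the binning resolution k rather than on the size of the data set, which is the property the description relies on.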
In step 170, the value representative of a variable associated
with the aggregated data points in a respective bin,
determined in step 140, is mapped to a color value. For
example, a color transfer function can be used to map an
average value of a numeric external variable for each bin.
Each splat is then rendered with a color value that is a
function of the external variable associated with the aggregated
data points in a respective bin.

Finally, the splats are composited to form a volume
rendered image on a display screen (step 180). The image
includes the rendered splats with opacity and color determined
according to steps 150-170. The splats are plotted
along discretized scatter axes at bin positions determined in
step 120. In this way, the volume rendered image is a splat
plot that visually approximates the scattered data points.

FIG. 2 shows an example color image of a two-dimensional
splat plot 200 that uses splats to visually
approximate scattered data according to the present invention.
Splat plot 200 was generated using the sample data
described above with respect to Tables 1 and 2.

A display window 205 includes a slider 210 and two
thumbwheels Rotx, Roty for manipulating the orientation of
the two-dimensional splat scatter plot 200. Slider 210 globally
alters the opacity of splats in the two-dimensional splat
scatter plot 200. Thumbwheels Rotx and Roty rotate the
image about horizontal and vertical axes respectively. Other
controls (not shown) for manipulating the plot 200 such as,
magnifying, reducing, or shifting the image can be used.
Finally, a legend is provided to show what the opacity and
color of the splats represent (e.g. opacity represents a count
value and color represents an average value, 0-15 is mapped
to blue, 15-30 is mapped to green, 30-45 is mapped to
yellow, and 45 and above is mapped to red).

FIG. 3 shows an example color image of a three-dimensional
splat plot 300 that uses splats to visually
approximate scattered data according to the present invention.
Splat plot 300 was generated using the sample data
described above with respect to Tables 1 and 2. FIG. 4 shows
a close-up view 400 of the color image of FIG. 3 which
further shows an example of textured splats.

According to a further feature of the present invention, a
dragger object is displayed to permit a user to select different
regions inside the splat plot 400. The dragger object is
shown in this example as a relatively opaque cylinder having
reference axes that are parallel to the displayed scatter axes.
The dragger object can be manipulated by a user through a
mouse or other user-interface control.

Information about a selected region at which the dragger
object is located can then be displayed. This information can
include the values of the data attributes of the bin at or
nearest to the selected region. See, e.g., the top window 450
in FIG. 4 which shows information on interior binned data
points at the location of the dragger object, namely, longitude
6571.99-6576.81, latitude 22090.8-22099.1, depth bin
10107.3-10122.6. By moving and selecting different regions
using the dragger object, a user can navigate inside a volume
rendered image. By reading window 450 a user can scan
information on interior binned regions.

According to another embodiment of the present
invention, volume rendering involving ray tracing or cell
projection can be used to represent bins of aggregated data
points. For ray tracing, volumes (e.g. polygons) are rendered
by using bin centers as vertices. The bin positions containing
no data are assumed to have zero density (completely
transparent). At bin positions where data is present, the
density is directly proportional to the count of scattered data
points. See, e.g., the use of ray tracing in volume rendering
in the reprinted article by Levoy, "Display of Surfaces from
Volume Data," IEEE Computer Graphics and Applications,
vol. 8, No. 5, May 1988, pp. 29-37 (reprinted pages 135-43)
(incorporated herein by reference).

For cell projection, cubes are constructed by using bin
centers as vertices. Again, if a bin with no data is needed as
a vertex to complete a cube, that vertex will have an opacity
equal to zero. The opacity assigned to the vertices is then a
function of the count of scattered data points in a corresponding
bin, according to the following equation:

opacity = 1 - exp(-u * count),

where opacity represents the opacity value of a cube, count
represents said count of data points in a corresponding bin,
u represents a global scale factor, and exp denotes an
exponential function. The transparent cubes are rendered
directly or by using tetrahedral decomposition in a back to
front order. See, e.g., the use of cell projection in volume
rendering in the articles by Wilhelms and Van Gelder, "A
Coherent Projection Approach for Direct Volume
Rendering," Computer Graphics, vol. 25, No. 4, July 1991,
pp. 275-284 (incorporated herein by reference) and Stein et
al., "Sorting and Hardware Assisted Rendering for Volume
Visualization," IEEE 1995, pages 83-89 (incorporated
herein by reference).

4. Example GUI Computer Environment

FIG. 5 is a block diagram illustrating an example environment
in which the present invention can operate. The
environment is a computer system 500 that includes one or
more processors, such as processor 504. Computer system
500 can include any type of general computer.

The processor 504 is connected to a communications bus
506. Various software embodiments are described in terms
of this example computer system. This description is illustrative
and not intended to limit the present invention. After
reading this description, it will be apparent to a person
skilled in the relevant art how to implement the invention
using other computer systems and/or computer architectures.

Computer system 500 includes a graphics subsystem 503.
Graphics subsystem 503 can be implemented as one or more
processor chips. The graphics subsystem 503 can be
included as part of processor 504 as shown in FIG. 5 or as
a separate graphics engine or processor. Graphics data is
output from the graphics subsystem 503 to the bus 506.
Display interface 505 forwards graphics data from the bus
506 for display on the display unit 506.

Computer system 500 also includes a main memory 508,
preferably random access memory (RAM), and can also
include a secondary memory 510. The secondary memory
510 can include, for example, a hard disk drive 512 and/or
a removable storage drive 514, representing a floppy disk
drive, a magnetic tape drive, an optical disk drive, etc. The
removable storage drive 514 reads from and/or writes to a
removable storage unit 518. Removable storage unit 518
represents a floppy disk, magnetic tape, optical disk, etc.,
which is read by and written to by removable storage drive
514. As will be appreciated, the removable storage unit 518
includes a computer usable storage medium having stored
therein computer software and/or data.

In alternative embodiments, secondary memory 510 may
include other similar means for allowing computer programs
or other instructions to be loaded into computer system 500.
Such means can include, for example, a removable storage
unit 522 and an interface 520. Examples can include a
program cartridge and cartridge interface (such as that found
in video game devices), a removable memory chip (such as
an EPROM, or PROM) and associated socket, and other
removable storage units 522 and interfaces 520 which allow
software and data to be transferred from the removable
storage unit 522 to computer system 500.

Computer system 500 can also include a communications 
interface 524. Communications interface 524 allows soft- 
ware and data to be transferred between computer system 
500 and external devices via communications path 526. 10 
Examples of communications interface 524 can include a 
modem, a network interface (such as Ethernet card), a 
communications port, etc. Software and data transferred via 
communications interface 524 are in the form of signals 
which can be electronic, electromagnetic, optical or other 15 
signals capable of being received by communications inter- 
face 524, via communications path 526. Note that commu- 
nications interface 524 provides a means by which computer 
system 500 can interface to a network such as the Internet. 

Graphical user interface module 530 transfers user inputs 20 
from peripheral devices 532 to bus 506. These peripheral 
devices 532 can be a mouse, keyboard, touch screen, 
microphone, joystick, stylus, light pen, or any other type of 
peripheral unit. These peripheral devices 532 enable a user 
to operate and control the data visualization tool of the 25 
present invention as described above. 

The present invention is described in terms of this 
example environment. Description in these terms is pro- 
vided for convenience only. It is not intended that the 
invention be limited to application in this example environ- 30 
ment. In fact, after reading the following description, it will 
become apparent to a person skilled in the relevant art how 
to implement the invention in alternative environments. 

The present invention is preferably implemented using 
software running (that is, executing) in an environment 35 
similar to that described above with respect to FIG. 5. In this 
document, the term "computer program product" is used to 
generally refer to removable storage unit 518 or a hard disk 
installed in hard disk drive 512. These computer program 
products are means for providing software to computer 40 
system 500. 

Computer programs (also called computer control logic) 
are stored in main memory and/or secondary memory 510. 
Computer programs can also be received via communications
interface 524. Such computer programs, when
executed, enable the computer system 500 to perform the 
features of the present invention as discussed herein. In 
particular, the computer programs, when executed, enable 
the processor 504 to perform the features of the present 
invention. Accordingly, such computer programs represent
controllers of the computer system 500. 

In an embodiment where the invention is implemented 
using software, the software may be stored in a computer 
program product and loaded into computer system 500 using 
removable storage drive 514, hard drive 512, or communications
interface 524. Alternatively, the computer program
product may be downloaded to computer system 500 over 
communications path 526. The control logic (software), 
when executed by the processor 504, causes the processor 
504 to perform the functions of the invention as described
herein. 

In another embodiment, the invention is implemented 
primarily in firmware and/or hardware using, for example, 
hardware components such as application specific integrated 
circuits (ASICs). Implementation of a hardware state
machine so as to perform the functions described herein will 
be apparent to persons skilled in the relevant art(s). 

5. Conclusion

While various embodiments of the present invention have 
been described above, it should be understood that they have 
been presented by way of example only, and not limitation. 
It will be understood by those skilled in the art that various 
changes in form and details may be made therein without 
departing from the spirit and scope of the invention as 
defined in the appended claims. Thus, the breadth and scope 
of the present invention should not be limited by any of the 
above-described exemplary embodiments, but should be 
defined only in accordance with the following claims and 
their equivalents. 

What is claimed is: 

1. A method for visually approximating a scatter plot of 
data points, comprising the steps of: 

binning the data points into bins; 

determining a bin position for each bin; 

determining a count of data points in each bin; and 

rendering splats at bin positions of corresponding bins, 
each splat having an opacity that is a function of said 
count of data points in a corresponding bin, whereby, a 
splat plot can be displayed that visually approximates 
the scatter plot of data points. 
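
By way of illustration only, the binning, bin position, and count steps of claim 1 might be sketched as follows. The sketch assumes uniform-width bins and a single binning resolution shared by all axes; the function names `bin_points` and `bin_position` and the `mins`/`maxs` range parameters are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def bin_points(points, mins, maxs, resolution):
    """Count data points per bin.

    points: iterable of equal-length tuples, one value per plot axis.
    mins, maxs: assumed per-axis value ranges (hypothetical inputs).
    resolution: assumed number of uniform bins along every axis.
    Returns a dict mapping a tuple of bin numbers to a point count.
    """
    counts = defaultdict(int)
    for point in points:
        key = []
        for value, lo, hi in zip(point, mins, maxs):
            span = hi - lo
            # Discretize the value into a bin number along this axis.
            number = int((value - lo) / span * resolution) if span > 0 else 0
            # Clamp so that value == hi falls into the last bin.
            key.append(min(max(number, 0), resolution - 1))
        counts[tuple(key)] += 1
    return dict(counts)


def bin_position(bin_numbers, mins, maxs, resolution):
    """Return the bin-center coordinate along each axis."""
    return tuple(
        lo + (number + 0.5) * (hi - lo) / resolution
        for number, lo, hi in zip(bin_numbers, mins, maxs)
    )
```

In this sketch the rendering cost depends on the number of occupied bins rather than on the number of data points, since each occupied bin yields one splat.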

2. The method of claim 1, further comprising the step of: 
globally scaling the opacity of each splat. 

3. The method of claim 1, wherein said rendering step 
renders each splat having an opacity that is a function of said 
count of data points in a corresponding bin, according to the 
following equation: 

opacity = 1 - exp(-u * count),

where, opacity represents the opacity value of a splat at
approximately a splat center, count represents said count of
data points in a corresponding bin, u represents a global 
scale factor, and exp denotes an exponential function. 
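
As a minimal sketch of the relationship recited in claim 3, the opacity at a splat center can be computed directly from the count and the global scale factor u. The function name `splat_opacity` is hypothetical.

```python
import math


def splat_opacity(count, u):
    """Opacity at the splat center: 1 - exp(-u * count)."""
    return 1.0 - math.exp(-u * count)


# An empty bin is fully transparent; opacity rises toward 1 as count grows.
for count in (0, 1, 10, 100):
    print(count, splat_opacity(count, u=0.05))
```

Because the exponential saturates, raising u makes sparsely populated bins more visible, while densely populated bins remain bounded by an opacity of 1.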

4. The method of claim 1, further comprising the steps of: 
for each bin, determining an aggregate value of a variable
associated with said data points in a respective bin;

wherein said rendering step renders each splat with a
respective color that is a function of said aggregate
value determined for a corresponding bin.
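
One possible reading of claim 4 is sketched here, assuming the aggregate value is an average of the variable over the data points in each bin and that color is assigned by a linear blue-to-red ramp. The ramp and the names `mean_per_bin` and `ramp_color` are illustrative assumptions only.

```python
def mean_per_bin(bin_keys, values):
    """Average a variable over the data points that fall in each bin.

    bin_keys: the bin identifier of each data point.
    values: the variable's value for each data point, in the same order.
    """
    sums, counts = {}, {}
    for key, value in zip(bin_keys, values):
        sums[key] = sums.get(key, 0.0) + value
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def ramp_color(value, vmin, vmax):
    """Map an aggregate value to (r, g, b) on an assumed blue-to-red ramp."""
    t = 0.0 if vmax == vmin else (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)  # clamp values outside [vmin, vmax]
    return (t, 0.0, 1.0 - t)
```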

5. The method of claim 1, wherein said rendering step 
composites splats in a back-to-front order such that splats
located farthest from a display screen are rendered before 
splats located closer to a display screen. 
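
The back-to-front ordering of claim 5 can be illustrated for a single pixel, assuming each splat contributes a color and an opacity that are blended with the conventional "over" operator. The tuple layout and the name `composite_back_to_front` are hypothetical.

```python
def composite_back_to_front(splats, background=(0.0, 0.0, 0.0)):
    """Blend splats covering one pixel, farthest from the screen first.

    splats: list of (depth, (r, g, b), opacity) tuples, where a larger
    depth is assumed to mean farther from the display screen.
    """
    pixel = background
    # Sort by decreasing depth so the farthest splat is rendered first.
    for _, color, alpha in sorted(splats, key=lambda s: s[0], reverse=True):
        pixel = tuple(alpha * c + (1.0 - alpha) * p
                      for c, p in zip(color, pixel))
    return pixel
```

Sorting before blending makes the result independent of the order in which the splats were generated, which matters because the "over" operator is not commutative.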

6. The method of claim 1, wherein each bin position has 
at least one position coordinate, each position coordinate 
corresponds to an axis in the scatter plot.

7. The method of claim 1, wherein said binning step 
comprises: 

discretizing each variable to be plotted along a respective 
axis in a splat plot according to a binning resolution. 

8. The method of claim 1, wherein each bin position 
includes two position coordinates corresponding to two 
respective variables associated with the data points, and 
further comprising the step of: 

displaying a two-dimensional splat plot that includes said 
rendered splats located at respective bin positions along 
two axes in said two-dimensional splat plot. 

9. The method of claim 1, wherein each bin position 
includes three position coordinates corresponding to three 
respective variables associated with the data points, and 
further comprising the step of: 

displaying a three-dimensional splat plot that includes 
said rendered splats located at respective bin positions 
along three axes in said three-dimensional splat plot. 

10. The method of claim 1, wherein said rendering step
texture maps said opacity across a polygon to represent a 
splat. 
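
A sketch of an opacity texture of the kind that could be mapped across a polygon per claim 10 follows. It assumes a square texture with a Gaussian falloff from the splat center; the Gaussian profile, the `size` and `sigma` parameters, and the name `opacity_texture` are assumptions for illustration and are not recited in the claims.

```python
import math


def opacity_texture(center_opacity, size=8, sigma=0.4):
    """Build a size x size grid of opacities for one splat.

    Texel centers span [-1, 1] on both axes. Opacity equals
    center_opacity at the middle and decays with an assumed
    Gaussian profile toward the polygon edges.
    """
    texture = []
    for row in range(size):
        y = (row + 0.5) / size * 2.0 - 1.0
        line = []
        for col in range(size):
            x = (col + 0.5) / size * 2.0 - 1.0
            falloff = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            line.append(center_opacity * falloff)
        texture.append(line)
    return texture
```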

11. The method of claim 1, further comprising the steps of:

displaying a splat plot that includes said rendered splats; 
displaying a dragger object; 

permitting a user to select a region in the splat plot by 
moving said dragger object to said region; and 

displaying information about said region; whereby infor- 
mation on regions inside a volume rendered splat plot 
can be read. 
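
The region selection of claim 11 might be approximated as a lookup from the dragger position to the bin that contains it. The sketch assumes uniform bins, a nonzero range on every axis, and a per-bin count table; the name `region_info` and the returned fields are hypothetical.

```python
def region_info(dragger_pos, mins, maxs, resolution, counts):
    """Return information about the bin under the dragger object.

    dragger_pos: dragger coordinates, one per plot axis.
    counts: dict mapping a tuple of bin numbers to a point count.
    Assumes maxs[i] > mins[i] for every axis.
    """
    key = tuple(
        min(max(int((x - lo) / (hi - lo) * resolution), 0), resolution - 1)
        for x, lo, hi in zip(dragger_pos, mins, maxs)
    )
    return {"bin": key, "count": counts.get(key, 0)}
```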

12. A system for visually approximating a scatter plot of 
data points, comprising: 

means for binning the data points into bins; 
means for determining a bin position for each bin; 
means for determining a count of data points in each bin; 
and 

means for rendering splats at bin positions of correspond- 
ing bins, each splat having an opacity that is a function 
of said count of data points in a corresponding bin, 
whereby, a splat plot can be displayed that visually 
approximates the scatter plot of data points. 

13. The system of claim 12, further comprising: 
means for globally scaling the opacity of each splat. 

14. The system of claim 12, wherein said rendering means 
renders each splat having an opacity that is a function of said 
count of data points in a corresponding bin, according to the 
following equation: 

opacity = 1 - exp(-u * count),

where, opacity represents the opacity value of a splat at an 
approximate splat center, count represents said count of 
aggregated data points in a corresponding bin, u represents 
a global scale factor, and exp denotes an exponential func- 
tion. 

15. The system of claim 12, further comprising: 
means for determining an aggregate value of a variable
associated with said data points in a respective bin;
wherein said rendering means renders each splat with a 
respective color that is a function of said aggregate 
value determined for a corresponding bin. 

16. The system of claim 12, wherein said rendering means 
composites splats in a back-to-front order such that splats 
located furthest from a display screen are rendered before 
splats located closer to a display screen. 

17. The system of claim 12, wherein each bin position has 
at least one position coordinate, each position coordinate 
corresponds to an axis in the scatter plot. 

18. The system of claim 12, wherein said means for 
binning comprises: 

means for discretizing each variable to be plotted along a 
respective axis in the splat plot according to a binning 
resolution. 

19. The system of claim 12, wherein each bin position
includes two position coordinates corresponding to two 
respective variables associated with the data points, and 
further comprising: 

means for displaying a two-dimensional splat plot that
includes said rendered splats located at respective bin
positions along two axes in said two-dimensional splat
plot.

20. The system of claim 12, wherein each bin position
includes three position coordinates corresponding to three
respective variables associated with the data points, and
further comprising:

means for displaying a three-dimensional splat plot that
includes said rendered splats located at respective bin
positions along three axes in said three-dimensional
splat plot.

21. The system of claim 12, wherein said rendering means 
comprises means for texture mapping said opacity across a 
polygon to represent a respective splat. 

22. The system of claim 12, further comprising:

means for displaying a splat plot that includes said
rendered splats;

means for displaying a dragger object;

means for permitting a user to select a region in the splat
plot by moving said dragger object to said region; and

means for displaying information about said region;
whereby information on regions inside a volume rendered
splat plot can be read.
23. A computer program product comprising a computer
useable medium having computer program logic recorded
thereon for enabling a graphics processor in a computer
system to visually approximate a scatter plot of data points,
said computer program logic comprising:

means for enabling the graphics processor to bin the data
points into bins;

means for enabling the graphics processor to determine a
bin position for each bin;

means for enabling the graphics processor to determine a
count of data points in each bin; and

means for enabling the graphics processor to render splats
at bin locations of corresponding bins, each splat having
an opacity that is a function of said count of data
points in a corresponding bin, whereby, a splat plot can
be displayed that visually approximates the scatter plot
of data points.

24. A method for visually approximating a scatter plot of 
data points, comprising the steps of: 

binning the data points into bins;

determining a bin position for each bin; and 

volume rendering at bin positions of corresponding bins.

25. The method of claim 24, wherein said volume ren- 
dering step includes ray tracing. 

26. The method of claim 24, wherein said volume ren-
dering step includes cell projection. 

***** 